verification mechanism
SelecTKD: Selective Token-Weighted Knowledge Distillation for LLMs
Huang, Haiduo, Song, Jiangcheng, Zhang, Yadong, Ren, Pengju
Knowledge distillation (KD) is a standard route to compress Large Language Models (LLMs) into compact students, yet most pipelines apply the token-wise loss uniformly, regardless of teacher confidence. This indiscriminate supervision amplifies noisy, high-entropy signals and is especially harmful under large teacher-student capacity gaps. We introduce SelecTKD, a plug-and-play Selective Token-Weighted distillation framework that shifts the focus from "how to measure divergence" to "where to apply learning". At each step, the student proposes tokens that the teacher verifies through a robust propose-and-verify procedure with two variants: greedy Top-k and non-greedy Spec-k. Accepted tokens receive the full loss, while rejected tokens are masked or down-weighted. This objective-agnostic design works with on- and off-policy data, induces an implicit curriculum quantified by the Token Acceptance Rate (TAR), and stabilizes optimization. Across instruction following, mathematical reasoning, code generation, and a VLM setting, SelecTKD consistently improves strong baselines and achieves state-of-the-art results for small models without architectural changes or extra reference models.
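As a concrete illustration of the selective weighting described above, here is a minimal PyTorch sketch of the greedy Top-k variant; the function name, the acceptance test, and the down-weighting factor are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal sketch of selective token-weighted distillation.
# The top-k acceptance test and rejected_weight are assumptions.
import torch
import torch.nn.functional as F

def selective_kd_loss(student_logits, teacher_logits, proposed_ids,
                      k=5, rejected_weight=0.0):
    """student_logits, teacher_logits: (batch, seq, vocab) tensors;
    proposed_ids: (batch, seq) tokens proposed by the student."""
    # Greedy Top-k verification: accept a token if the student's proposal
    # falls within the teacher's k most probable tokens at that position.
    topk_ids = teacher_logits.topk(k, dim=-1).indices            # (B, S, k)
    accepted = (topk_ids == proposed_ids.unsqueeze(-1)).any(-1)  # (B, S)

    # Per-token forward KL(teacher || student).
    t_logp = F.log_softmax(teacher_logits, dim=-1)
    s_logp = F.log_softmax(student_logits, dim=-1)
    kl = (t_logp.exp() * (t_logp - s_logp)).sum(-1)              # (B, S)

    # Accepted tokens get the full loss; rejected tokens are down-weighted
    # (rejected_weight=0.0 masks them entirely).
    weights = torch.full_like(kl, rejected_weight)
    weights[accepted] = 1.0
    tar = accepted.float().mean()  # Token Acceptance Rate, for monitoring
    return (weights * kl).sum() / weights.sum().clamp(min=1e-8), tar
```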
Toward Understanding Security Issues in the Model Context Protocol Ecosystem
The Model Context Protocol (MCP) is an emerging open standard that enables AI-powered applications to interact with external tools through structured metadata. A rapidly growing ecosystem has formed around MCP, including a wide range of MCP hosts (e.g., Cursor, Windsurf, Claude Desktop, and Cline), MCP registries (e.g., mcp.so, MCP Market, MCP Store, Pulse MCP, Smithery, and npm), and thousands of community-contributed MCP servers. Although the MCP ecosystem is gaining traction, there has been little systematic study of its architecture and associated security risks. In this paper, we present the first comprehensive security analysis of the MCP ecosystem. We decompose the ecosystem into three core components: hosts, registries, and servers, and study the interactions and trust relationships among them. Users search for servers on registries and configure them in the host, which translates LLM-generated output into invocations of the external tools provided by the servers and executes them. Our qualitative analysis reveals that hosts lack verification mechanisms for LLM-generated outputs, enabling malicious servers to manipulate model behavior and induce a variety of security threats, including but not limited to sensitive data exfiltration. We also uncover a wide range of vulnerabilities that enable attackers to hijack servers, stemming from the lack of a vetted server submission process in registries. To support our analysis, we collect and analyze a dataset of 67,057 servers from six public registries. Our quantitative analysis demonstrates that a substantial number of servers can be hijacked by attackers. Finally, we propose practical defense strategies for MCP hosts, registries, and users. We responsibly disclosed our findings to affected hosts and registries.
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > Portugal > Coimbra > Coimbra (0.04)
- Asia (0.04)
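To make the host-side gap concrete, here is a hedged Python sketch of the kind of output-verification check the paper argues hosts currently lack: vetting an LLM-generated tool invocation against an allowlist before forwarding it to an MCP server. The allowlist and message shape are hypothetical, not part of the MCP specification.

```python
# Hypothetical host-side gate: vet an LLM-proposed tool call before
# executing it. Tool names and argument schemas are assumptions.
import json

ALLOWED_TOOLS = {"read_file": {"path"}, "web_search": {"query"}}

def verify_invocation(llm_output: str) -> dict:
    """Parse an LLM-generated tool call and reject anything unexpected."""
    call = json.loads(llm_output)
    tool, args = call.get("tool"), call.get("arguments", {})
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {tool!r} is not allowlisted")
    unexpected = set(args) - ALLOWED_TOOLS[tool]
    if unexpected:
        raise ValueError(f"unexpected arguments for {tool!r}: {unexpected}")
    return call  # only now forwarded to the MCP server

# e.g. verify_invocation('{"tool": "exfiltrate", "arguments": {}}') raises
```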
MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline
Qiang, Rushi, Zhuang, Yuchen, Singh, Anikait, Liang, Percy, Zhang, Chao, Yang, Sherry, Dai, Bo
While Language Models (LMs) have made significant progress in automating machine learning engineering (MLE), the acquisition of high-quality MLE training data remains severely constrained. Current MLE benchmarks suffer from low scalability and limited applicability because they rely on static, manually curated tasks that demand extensive time and effort to produce. We introduce MLE-Smith, a fully automated multi-agent pipeline that transforms raw datasets into competition-style MLE challenges through an efficient generate-verify-execute paradigm, scaling MLE tasks with verifiable quality, real-world usability, and rich diversity. The multi-agent pipeline in MLE-Smith drives structured task design and standardized refactoring, coupled with a hybrid verification mechanism that enforces strict structural rules and high-level semantic soundness. It further validates empirical solvability and real-world fidelity through interactive execution. We apply MLE-Smith to 224 real-world datasets and generate 606 tasks spanning multiple categories, objectives, and modalities, demonstrating that MLE-Smith works effectively across a wide range of real-world datasets. Evaluation of the generated tasks shows that the performance of eight mainstream and cutting-edge LLMs on MLE-Smith tasks is strongly correlated with their performance on carefully human-designed tasks, highlighting the effectiveness of MLE-Smith in scaling up MLE tasks while maintaining task quality.
- Oceania > Australia (0.14)
- North America > United States (0.04)
- Leisure & Entertainment (1.00)
- Media (0.93)
- Education > Curriculum > Subject-Specific Education (0.88)
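As a rough illustration of the "strict structural rules" stage of MLE-Smith's hybrid verifier, the toy Python check below validates a generated task specification; the field names and rules are assumptions for illustration, and the semantic-soundness and interactive-execution stages are not shown.

```python
# Toy structural-rule check on a generated MLE task specification.
# Field names and the metric whitelist are illustrative assumptions.
REQUIRED_FIELDS = {"name", "description", "data_files", "metric", "target"}
KNOWN_METRICS = {"accuracy", "rmse", "f1", "auc"}

def check_task_structure(task: dict) -> list[str]:
    """Return a list of structural violations; an empty list means pass."""
    problems = [f"missing field: {f}"
                for f in sorted(REQUIRED_FIELDS - task.keys())]
    if not task.get("data_files"):
        problems.append("task must reference at least one data file")
    if task.get("metric") not in KNOWN_METRICS:
        problems.append(f"unknown metric: {task.get('metric')!r}")
    return problems
```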
WSI-Agents: A Collaborative Multi-Agent System for Multi-Modal Whole Slide Image Analysis
Lyu, Xinheng, Liang, Yuci, Chen, Wenting, Ding, Meidan, Yang, Jiaqi, Huang, Guolin, Zhang, Daokun, He, Xiangjian, Shen, Linlin
Whole slide images (WSIs) are vital in digital pathology, enabling gigapixel tissue analysis across various pathological tasks. While recent advancements in multi-modal large language models (MLLMs) allow multi-task WSI analysis through natural language, they often underperform task-specific models. Collaborative multi-agent systems have emerged as a promising way to balance versatility and accuracy in healthcare, yet their potential remains underexplored in pathology-specific domains. To address these issues, we propose WSI-Agents, a novel collaborative multi-agent system for multi-modal WSI analysis. WSI-Agents integrates specialized functional agents with robust task allocation and verification mechanisms to enhance both task-specific accuracy and multi-task versatility through three components: (1) a task allocation module that assigns tasks to expert agents drawn from a model zoo of patch- and WSI-level MLLMs, (2) a verification mechanism that ensures accuracy through internal consistency checks and external validation against pathology knowledge bases and domain-specific models, and (3) a summary module that synthesizes the final summary with visual interpretation maps. Extensive experiments on multi-modal WSI benchmarks show the superiority of WSI-Agents over current WSI MLLMs and medical agent frameworks across diverse tasks.
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.04)
- (3 more...)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
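The following is a minimal sketch, under assumed task types and agent names, of a task-allocation module in the spirit of WSI-Agents: each query is routed to an expert agent from a model zoo before the verification and summary stages would run.

```python
# Illustrative task-allocation module: route a query to an expert agent
# from a model zoo keyed by task type. Task types and agents are toys.
from typing import Callable

MODEL_ZOO: dict[str, Callable[[str], str]] = {
    "report_generation": lambda q: f"[WSI-level MLLM] report for: {q}",
    "vqa":               lambda q: f"[patch-level MLLM] answer to: {q}",
}

def allocate(task_type: str, query: str) -> str:
    agent = MODEL_ZOO.get(task_type)
    if agent is None:
        raise KeyError(f"no expert agent registered for {task_type!r}")
    answer = agent(query)
    # A full system would next run the verification mechanism (internal
    # consistency checks, external validation against pathology knowledge
    # bases) before the summary module produces the final report.
    return answer
```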
RVISA: Reasoning and Verification for Implicit Sentiment Analysis
Lai, Wenna, Xie, Haoran, Xu, Guandong, Li, Qing
With increasing social demand for fine-grained sentiment analysis (SA), implicit sentiment analysis (ISA) poses a significant challenge because expressions lack salient cue words. It necessitates reliable reasoning to understand how a sentiment is aroused and thus to determine implicit sentiments. In the era of Large Language Models (LLMs), Encoder-Decoder (ED) LLMs have become popular backbone models for SA applications, given their impressive text comprehension and reasoning abilities across diverse tasks. Decoder-only (DO) LLMs, on the other hand, exhibit superior natural language generation and in-context learning capabilities, but their responses may contain misleading or inaccurate information. To identify implicit sentiment with reliable reasoning, this study proposes RVISA, a two-stage reasoning framework that harnesses the generation ability of DO LLMs and the reasoning ability of ED LLMs to train an enhanced reasoner. Specifically, we adopt three-hop reasoning prompting to explicitly furnish sentiment elements as cues. The generated rationales are then used to fine-tune an ED LLM into a skilled reasoner. Additionally, we develop a straightforward yet effective verification mechanism to ensure the reliability of the learned reasoning. We evaluated the proposed method on two benchmark datasets and achieved state-of-the-art ISA performance.
- Oceania > Australia > New South Wales > Sydney (0.14)
- Asia > China > Hong Kong (0.05)
- North America > Canada > Ontario > Toronto (0.04)
- (6 more...)
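To make the prompting scheme concrete, here is a small Python sketch of three-hop reasoning prompting in the spirit of RVISA; the exact hop wording is an assumption, chosen to elicit the sentiment elements (aspect, opinion, polarity) that the paper furnishes as cues.

```python
# Three-hop reasoning prompting sketch; the hop wording is assumed.
HOPS = [
    "Which aspect of the target is the sentence about?",
    "What underlying opinion does the sentence express toward that aspect?",
    "Given the aspect and opinion, what is the sentiment polarity "
    "(positive, negative, or neutral)?",
]

def three_hop(sentence: str, target: str, ask) -> list[str]:
    """ask: callable sending a prompt to the DO LLM, returning its answer."""
    context = f"Sentence: {sentence}\nTarget: {target}"
    answers = []
    for hop in HOPS:
        answer = ask(f"{context}\nQuestion: {hop}\nAnswer:")
        answers.append(answer)
        context += f"\nQ: {hop}\nA: {answer}"  # chain answers into next hop
    # The final polarity can be checked against the gold label (the paper's
    # simple verification step) before the rationale is kept for fine-tuning.
    return answers
```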
Unified High-binding Watermark for Unconditional Image Generation Models
Ma, Ruinan, Tan, Yu-an, Wu, Shangbo, Chen, Tian, Wang, Yajie, Li, Yuanzhang
Deep learning techniques have enabled many unconditional image generation (UIG) models, such as GANs and diffusion models. The extremely realistic images they produce (also known as AI-Generated Content, or AIGC) create urgent needs for intellectual property protection, such as data traceability and copyright certification. An attacker can steal the output images of a target model and use them as part of the training data for a private surrogate UIG model. Because the implementation mechanisms of UIG models are diverse and complex, there is currently no unified and effective protection and verification method. To address these issues, we propose a two-stage unified watermark verification mechanism with high-binding effects for such models. In the first stage, we use an encoder to invisibly write a watermark image into the output images of the original AIGC tool, and extract the watermark image through the corresponding decoder. In the second stage, we design a decoder fine-tuning process, after which the fine-tuned decoder can correctly judge whether a suspicious model has stolen the original AIGC tool's data. Experiments demonstrate that our method completes verification with an almost zero false-positive rate using only the model's output images. Moreover, the proposed method can verify data theft across different types of UIG models, which further increases its practicality.
- Information Technology > Security & Privacy (1.00)
- Law (0.67)
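A hedged sketch of the second-stage verification decision follows: extract the watermark from a suspect model's outputs with the fine-tuned decoder and measure agreement with the registered mark. The decoder interface, similarity measure, and threshold are illustrative, not the paper's exact procedure.

```python
# Illustrative verification decision: decoder interface, agreement
# score, and threshold are assumptions, not the paper's exact method.
import numpy as np

def verify_theft(decoder, suspect_images, registered_mark, threshold=0.95):
    """Return True if the suspect model likely trained on watermarked outputs."""
    scores = []
    for img in suspect_images:
        extracted = decoder(img)  # decoder: image -> watermark bit array
        scores.append(np.mean(extracted == registered_mark))  # bit agreement
    return float(np.mean(scores)) >= threshold
```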
Proof-of-Learning is Currently More Broken Than You Think
Fang, Congyu, Jia, Hengrui, Thudi, Anvith, Yaghini, Mohammad, Choquette-Choo, Christopher A., Dullerud, Natalie, Chandrasekaran, Varun, Papernot, Nicolas
Proof-of-Learning (PoL) proposes that a model owner log training checkpoints to establish a proof of having expended the computation necessary for training. The authors of PoL forgo cryptographic approaches, trading rigorous security guarantees for scalability to deep learning. They argued empirically for this approach by showing that spoofing (computing a proof for a stolen model) is as expensive as obtaining the proof honestly by training the model. However, recent work has provided a counter-example that invalidates this observation. In this work we demonstrate, first, that while current PoL verification is indeed not robust to adversaries, recent work has largely underestimated this lack of robustness: existing spoofing strategies are either unreproducible or target weakened instantiations of PoL, meaning they are easily thwarted by changing hyperparameters of the verification. Instead, we introduce the first spoofing strategies that can be reproduced across different configurations of the PoL verification and can be executed for a fraction of the cost of previous spoofing strategies. This is possible because we identify key vulnerabilities of PoL and systematically analyze the underlying assumptions needed for robust verification of a proof. On the theoretical side, we show that realizing these assumptions reduces to open problems in learning theory. We conclude that one cannot develop a provably robust PoL verification mechanism without further understanding of optimization in deep learning.
- Research Report (1.00)
- Workflow (0.93)
- Government (0.67)
- Information Technology > Security & Privacy (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
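For readers unfamiliar with the scheme under attack, the sketch below captures the flavor of PoL verification: re-execute the logged updates for the largest checkpoint transitions and accept the proof only if the recomputed weights land within a tolerance. Here `train_step`, the distance measure, and `eps` stand in for the scheme's actual choices.

```python
# Schematic PoL verification: recompute the most suspicious transitions
# and check reproduction error. train_step, the norm, and eps are stand-ins.
import numpy as np

def verify_pol(checkpoints, batches, train_step, eps, n_largest=5):
    """checkpoints: logged weight vectors W_0..W_T; batches[t]: data at step t."""
    deltas = [np.linalg.norm(checkpoints[t + 1] - checkpoints[t])
              for t in range(len(checkpoints) - 1)]
    # Spend the verification budget on the largest (most suspicious) jumps.
    for t in np.argsort(deltas)[-n_largest:]:
        reproduced = train_step(checkpoints[t], batches[t])
        if np.linalg.norm(reproduced - checkpoints[t + 1]) > eps:
            return False  # un-reproducible transition: reject the proof
    return True
```

The spoofing strategies in the paper construct checkpoint sequences that pass exactly this kind of thresholded reproduction check.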
If you're only using AI for chatbots, you're falling behind
Chatbots and call-center support are table stakes when it comes to implementing artificial intelligence, according to a new report. Cognizant's Center for the Future of Work found that leaders are using automated analysis for decision support, automated reasoning, text mining, and object and speech recognition. The research includes a survey of 1,000 business leaders that measures how organizations view the potential of AI and their plans for deploying AI-enabled tools. The data-modernization report describes a vicious cycle that keeps companies from doing more with AI: it begins when a company has a low level of trust in the potential of artificial intelligence.
Towards Probabilistic Verification of Machine Unlearning
Sommer, David Marco, Song, Liwei, Wagh, Sameer, Mittal, Prateek
The right to be forgotten, also known as the right to erasure, is the right of individuals to have their data erased from an entity storing it. The General Data Protection Regulation in the European Union legally solidified this long-held notion. As a consequence, there is a growing need for mechanisms by which users can verify that service providers comply with their deletion requests. In this work, we take the first step in proposing a formal framework to study the design of such verification mechanisms for data deletion requests (also known as machine unlearning) in the context of systems that provide machine learning as a service. We propose a backdoor-based verification mechanism and demonstrate its effectiveness in certifying data deletion with high confidence within this framework. Our mechanism makes novel use of backdoor attacks in ML as a basis for quantitatively inferring machine unlearning. Each user poisons part of their training data by injecting a user-specific backdoor trigger associated with a user-specific target label. The prediction of the target label on test samples carrying the backdoor trigger is then used as an indication that the user's data was used to train the ML model. We formalize the verification process as a hypothesis testing problem and provide theoretical guarantees on the statistical power of the test. We experimentally demonstrate that our approach has minimal effect on the machine learning service while providing high-confidence verification of unlearning. With a 30% poison ratio and merely 20 test queries, our verification mechanism achieves false positive and false negative rates below 10^-5. Furthermore, we show the effectiveness of our approach against an adaptive adversary that uses a state-of-the-art backdoor defense method.
- North America > United States > California (0.14)
- North America > United States > New York > Erie County > Buffalo (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
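The hypothesis-testing step can be sketched in a few lines: if the model still predicts the user's target label on trigger-carrying queries far above chance, the data was likely not deleted. The chance level `p0` and the one-sided binomial test below mirror, but do not reproduce, the paper's exact formalization.

```python
# One-sided binomial test sketch for backdoor-based unlearning verification.
# p0 (chance rate of the target label) and alpha are assumed parameters.
from scipy.stats import binomtest

def verify_unlearning(model, triggered_inputs, target_label,
                      p0=0.1, alpha=1e-5):
    """model: callable mapping an input to a predicted label."""
    hits = sum(model(x) == target_label for x in triggered_inputs)
    test = binomtest(hits, len(triggered_inputs), p0, alternative="greater")
    # A small p-value means the backdoor is still effective, i.e. the
    # user's data was most likely NOT unlearned.
    return test.pvalue >= alpha  # True: deletion is plausible
```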
Disarming the world
For centuries, following bloody conflicts, military leaders acknowledged that some weapons were simply too awful to be used, yet those same militaries generally continued to use them. The First World War (WWI) saw the first successful deployment of chemical weapons on a massive scale. The horror of millions of dead soldiers, in trenches and on battlefields, shocked nations into signing the Geneva Protocol, pledging to refrain from the use of chemical and biological weapons in future wars. Over the past century, weapons that cause "unjustifiable" suffering in an indiscriminate and "unpredictable" manner have been subject to multilateral treaties that aim to disarm the countries possessing them and to control or ban their use altogether. While some may be sceptical that these efforts to disarm the world are effective, and challenges to disarmament remain, disarmament treaties play a key role in regulating and reducing stockpiles, as well as in limiting the testing and use of certain weapons in conflicts.